-
Notifications
You must be signed in to change notification settings - Fork 1.9k
fix(schema-compiler): Preserve full time buckets in period-over-perio… #9502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #9502 +/- ##
===========================================
- Coverage 83.89% 59.02% -24.87%
===========================================
Files 229 153 -76
Lines 83569 13067 -70502
Branches 0 2223 +2223
===========================================
- Hits 70111 7713 -62398
+ Misses 13458 5039 -8419
- Partials 0 315 +315
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ab02440 to
ea13aaa
Compare
|
Hi folks, just following up on this PR. I saw that a review was requested from the schema-compiler team. If there’s anything I can clarify or update, I’d be happy to help. Thanks in advance! |
KSDaemon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 LGTM!
…d queries Switches from INNER JOIN to LEFT JOIN in outerMeasuresJoinFullKeyQueryAggregate to prevent dropped rows when previous-period data is missing. Adds fixture and tests for period-over-period measures.
| .map( | ||
| (q, i) => (this.dimensionAliasNames().length ? | ||
| `INNER JOIN ${this.wrapInParenthesis((q))} as q_${i + 1} ON ${this.dimensionsJoinCondition(`q_${i}`, `q_${i + 1}`)}` : | ||
| `LEFT JOIN ${this.wrapInParenthesis((q))} as q_${i + 1} ON ${this.dimensionsJoinCondition(`q_${i}`, `q_${i + 1}`)}` : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are few subjoins? And it's probably possible that the result set with gaps may occur on the right side too... So maybe FULL OUTER JOIN should be used here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I was thinking the root q_0 would always have the full time set, but FULL OUTER JOIN will work too. I can make the code and description updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After more analysis, FULL OUTER JOIN won't work here because it seems PostgreSQL requires merge-joinable or hash-joinable join conditions and dimensionsJoinCondition includes other null-handling logic that prevents that (found by re-running tests after just changing to FULL OUTER JOIN). LEFT JOIN still seems appropriate because we can assume the first query q_0 defines the full time range and that is what is preserved when data is missing from later subqueries. The same logic would apply in cases with few subjoins, q_0 is treated as the time source subquery and LEFT JOIN ensures we retain those rows when there is missing data on the right.
ea13aaa to
715370b
Compare
|
It seems that We won't solve this problem at the baseQuery level right now. We are solving this problem in Tesseract. |
|
Thanks for the discussion and feedback. I’ll go ahead and close this PR and take a look at other ways I can contribute. |
|
@nchristo1213 thank you for your contribution (even though we haven't merged it) and discussion! |
fix(schema-compiler): Preserve Full Time Buckets in Period-over-Period Queries
This PR replaces
INNER JOINwithLEFT JOINwhen stitching sub-queries inouterMeasuresJoinFullKeyQueryAggregate, ensuring all expected time buckets appear when calculating period-over-period (PoP) metrics.Problem
When PoP queries used
INNER JOIN, buckets missing from auxiliary sub-queries (q_1,q_2, ...) were dropped causing incomplete results or incorrect ratios.Solution
Use
LEFT JOINinstead, preserving all keys from the anchor query (q_0). Missing values in auxiliary queries yieldNULLrather than removing the row entirely.Benefits
Tests
salescube withdate,category,region, and PoP measuresCAGR: AddedNULLrowsmulti-stage graphtests: Added secondary orderingThese tests ensure the correct number of periods appear and that PoP measures return
NULLinstead of dropping rows when previous-period values are unavailable.Check List
resolves #9353